Opera: reconstructing optimal genomic scaffolds with high-throughput paired-end sequences.
Internal identifier: 000A74 (Main/Exploration); previous: 000A73; next: 000A75
Authors: Song Gao [Singapore]; Wing-Kin Sung; Niranjan Nagarajan
Source:
- Journal of computational biology : a journal of computational molecular cell biology ; 2011.
English descriptors
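- KwdEn: Algorithms; Burkholderia pseudomallei (genetics); Computer Simulation; Contig Mapping (methods); Escherichia coli (genetics); Genome; High-Throughput Nucleotide Sequencing; Humans; Models, Genetic; Saccharomyces cerevisiae (genetics); Sequence Analysis, DNA; Software
- MESH:
  - genetics: Burkholderia pseudomallei; Escherichia coli; Saccharomyces cerevisiae
  - methods: Contig Mapping
  - unqualified: Algorithms; Computer Simulation; Genome; High-Throughput Nucleotide Sequencing; Humans; Models, Genetic; Sequence Analysis, DNA; Software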
Abstract
Scaffolding, the problem of ordering and orienting contigs, typically using paired-end reads, is a crucial step in the assembly of high-quality draft genomes. Even as sequencing technologies and mate-pair protocols have improved significantly, scaffolding programs still rely on heuristics, with no guarantees on the quality of the solution. In this work, we explored the feasibility of an exact solution for scaffolding and present a first tractable solution for this problem (Opera). We also describe a graph contraction procedure that allows the solution to scale to large scaffolding problems and demonstrate this by scaffolding several large real and synthetic datasets. In comparisons with existing scaffolders, Opera simultaneously produced longer and more accurate scaffolds demonstrating the utility of an exact approach. Opera also incorporates an exact quadratic programming formulation to precisely compute gap sizes (Availability: http://sourceforge.net/projects/operasf/ ).
DOI: 10.1089/cmb.2011.0170
PubMed: 21929371
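Affiliations:
- Song Gao: NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore.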
Links to previous steps (curation, corpus...)
- to stream PubMed, to step Corpus: 000304
- to stream PubMed, to step Curation: 000304
- to stream PubMed, to step Checkpoint: 000307
- to stream Ncbi, to step Merge: 000903
- to stream Ncbi, to step Curation: 000903
- to stream Ncbi, to step Checkpoint: 000903
- to stream Main, to step Merge: 000A74
- to stream Main, to step Curation: 000A74
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Opera: reconstructing optimal genomic scaffolds with high-throughput paired-end sequences.</title>
<author><name sortKey="Gao, Song" sort="Gao, Song" uniqKey="Gao S" first="Song" last="Gao">Song Gao</name>
<affiliation wicri:level="4"><nlm:affiliation>NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore.</nlm:affiliation>
<country xml:lang="fr">Singapour</country>
<wicri:regionArea>NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore</wicri:regionArea>
<orgName type="university">Université nationale de Singapour</orgName>
</affiliation>
</author>
<author><name sortKey="Sung, Wing Kin" sort="Sung, Wing Kin" uniqKey="Sung W" first="Wing-Kin" last="Sung">Wing-Kin Sung</name>
</author>
<author><name sortKey="Nagarajan, Niranjan" sort="Nagarajan, Niranjan" uniqKey="Nagarajan N" first="Niranjan" last="Nagarajan">Niranjan Nagarajan</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PubMed</idno>
<date when="2011">2011</date>
<idno type="doi">10.1089/cmb.2011.0170</idno>
<idno type="RBID">pubmed:21929371</idno>
<idno type="pmid">21929371</idno>
<idno type="wicri:Area/PubMed/Corpus">000304</idno>
<idno type="wicri:Area/PubMed/Curation">000304</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000307</idno>
<idno type="wicri:Area/Ncbi/Merge">000903</idno>
<idno type="wicri:Area/Ncbi/Curation">000903</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">000903</idno>
<idno type="wicri:Area/Main/Merge">000A74</idno>
<idno type="wicri:Area/Main/Curation">000A74</idno>
<idno type="wicri:Area/Main/Exploration">000A74</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Opera: reconstructing optimal genomic scaffolds with high-throughput paired-end sequences.</title>
<author><name sortKey="Gao, Song" sort="Gao, Song" uniqKey="Gao S" first="Song" last="Gao">Song Gao</name>
<affiliation wicri:level="4"><nlm:affiliation>NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore.</nlm:affiliation>
<country xml:lang="fr">Singapour</country>
<wicri:regionArea>NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore</wicri:regionArea>
<orgName type="university">Université nationale de Singapour</orgName>
</affiliation>
</author>
<author><name sortKey="Sung, Wing Kin" sort="Sung, Wing Kin" uniqKey="Sung W" first="Wing-Kin" last="Sung">Wing-Kin Sung</name>
</author>
<author><name sortKey="Nagarajan, Niranjan" sort="Nagarajan, Niranjan" uniqKey="Nagarajan N" first="Niranjan" last="Nagarajan">Niranjan Nagarajan</name>
</author>
</analytic>
<series><title level="j">Journal of computational biology : a journal of computational molecular cell biology</title>
<idno type="e-ISSN">1557-8666</idno>
<imprint><date when="2011" type="published">2011</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Algorithms</term>
<term>Burkholderia pseudomallei (genetics)</term>
<term>Computer Simulation</term>
<term>Contig Mapping (methods)</term>
<term>Escherichia coli (genetics)</term>
<term>Genome</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Humans</term>
<term>Models, Genetic</term>
<term>Saccharomyces cerevisiae (genetics)</term>
<term>Sequence Analysis, DNA</term>
<term>Software</term>
</keywords>
<keywords scheme="MESH" qualifier="genetics" xml:lang="en"><term>Burkholderia pseudomallei</term>
<term>Escherichia coli</term>
<term>Saccharomyces cerevisiae</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en"><term>Contig Mapping</term>
</keywords>
<keywords scheme="MESH" xml:lang="en"><term>Algorithms</term>
<term>Computer Simulation</term>
<term>Genome</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Humans</term>
<term>Models, Genetic</term>
<term>Sequence Analysis, DNA</term>
<term>Software</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Scaffolding, the problem of ordering and orienting contigs, typically using paired-end reads, is a crucial step in the assembly of high-quality draft genomes. Even as sequencing technologies and mate-pair protocols have improved significantly, scaffolding programs still rely on heuristics, with no guarantees on the quality of the solution. In this work, we explored the feasibility of an exact solution for scaffolding and present a first tractable solution for this problem (Opera). We also describe a graph contraction procedure that allows the solution to scale to large scaffolding problems and demonstrate this by scaffolding several large real and synthetic datasets. In comparisons with existing scaffolders, Opera simultaneously produced longer and more accurate scaffolds demonstrating the utility of an exact approach. Opera also incorporates an exact quadratic programming formulation to precisely compute gap sizes (Availability: http://sourceforge.net/projects/operasf/ ).</div>
</front>
</TEI>
<affiliations><list><country><li>Singapour</li>
</country>
<orgName><li>Université nationale de Singapour</li>
</orgName>
</list>
<tree><noCountry><name sortKey="Nagarajan, Niranjan" sort="Nagarajan, Niranjan" uniqKey="Nagarajan N" first="Niranjan" last="Nagarajan">Niranjan Nagarajan</name>
<name sortKey="Sung, Wing Kin" sort="Sung, Wing Kin" uniqKey="Sung W" first="Wing-Kin" last="Sung">Wing-Kin Sung</name>
</noCountry>
<country name="Singapour"><noRegion><name sortKey="Gao, Song" sort="Gao, Song" uniqKey="Gao S" first="Song" last="Gao">Song Gao</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>
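The record above can be read programmatically. The following is a minimal sketch in Python, not part of Dilib, that pulls the title, authors, DOI and PubMed ID out of such a record with the standard library. The record uses the `wicri:` and `nlm:` prefixes without declaring them, so the sketch wraps it in a hypothetical `<wrapper>` element that binds both prefixes to placeholder URIs; these URIs, the function name `parse_record` and the trimmed `SAMPLE` excerpt are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumptions): the record does not declare the
# wicri: and nlm: prefixes, so they are bound here only to make it parseable.
WRAPPER_OPEN = (
    '<wrapper xmlns:wicri="urn:example:wicri" '
    'xmlns:nlm="urn:example:nlm">'
)
WRAPPER_CLOSE = "</wrapper>"


def parse_record(record_xml: str) -> dict:
    """Extract basic bibliographic fields from one <record> element.

    Assumes record_xml has no leading XML declaration.
    """
    root = ET.fromstring(WRAPPER_OPEN + record_xml + WRAPPER_CLOSE)
    title_stmt = root.find(".//titleStmt")
    pub_stmt = root.find(".//publicationStmt")
    return {
        "title": title_stmt.findtext("title"),
        "authors": [name.text for name in title_stmt.findall("author/name")],
        "doi": pub_stmt.findtext("idno[@type='doi']"),
        "pmid": pub_stmt.findtext("idno[@type='pmid']"),
    }


# Trimmed excerpt of the record shown above (most attributes and fields omitted).
SAMPLE = (
    '<record><TEI><teiHeader><fileDesc><titleStmt>'
    '<title xml:lang="en">Opera: reconstructing optimal genomic scaffolds '
    'with high-throughput paired-end sequences.</title>'
    '<author><name sortKey="Gao, Song">Song Gao</name>'
    '<affiliation wicri:level="4"><nlm:affiliation>National University of '
    'Singapore, Singapore.</nlm:affiliation></affiliation></author>'
    '<author><name sortKey="Sung, Wing Kin">Wing-Kin Sung</name></author>'
    '<author><name sortKey="Nagarajan, Niranjan">Niranjan Nagarajan</name></author>'
    '</titleStmt><publicationStmt>'
    '<idno type="doi">10.1089/cmb.2011.0170</idno>'
    '<idno type="pmid">21929371</idno>'
    '</publicationStmt></fileDesc></teiHeader></TEI></record>'
)

if __name__ == "__main__":
    for field, value in parse_record(SAMPLE).items():
        print(f"{field}: {value}")
```

The same function should work on the full record as long as it is passed as a single string starting at `<record>`; wrapping is used instead of editing the record so that the original markup stays untouched.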
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Musique/explor/OperaV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000A74 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000A74 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Musique |area= OperaV1 |flux= Main |étape= Exploration |type= RBID |clé= pubmed:21929371 |texte= Opera: reconstructing optimal genomic scaffolds with high-throughput paired-end sequences. }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i -Sk "pubmed:21929371" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd \
       | NlmPubMed2Wicri -a OperaV1
This area was generated with Dilib version V0.6.21.